Wireless communication network traffic prediction is of great significance to operators in network construction, base station wireless resource management and user experience improvement. However, existing centralized algorithm models face problems of complexity and timeliness, making it difficult to meet the traffic prediction requirements at the whole-city scale. Therefore, a distributed wireless traffic prediction framework under cloud-edge collaboration was proposed to realize traffic prediction at the level of a single grid base station with low complexity and low communication overhead. Based on this distributed architecture, a wireless traffic prediction model based on federated learning was proposed: the traffic prediction model of each grid was trained synchronously, JS (Jensen-Shannon) divergence was used by the central cloud server to select grid traffic models with similar traffic distributions, and the Federated Averaging (FedAvg) algorithm was used to fuse the parameters of these similar models, so as to improve model generalization while accurately characterizing regional traffic. In addition, as traffic features differ markedly across areas within a city, a federated training method based on coalitional game was further proposed on the basis of the above algorithm. Combined with the super-additivity criterion, the grids were treated as players in the coalitional game and screened, and the core of the coalitional game and the Shapley value were introduced for profit distribution to ensure the stability of the coalition, thereby improving the prediction accuracy of the model. Experimental results show that, taking Short Message Service (SMS) traffic as an example, compared with grid-independent training, the proposed model reduces the prediction error most significantly in the suburbs, by 26.1% to 28.7%, while the reduction is 0.7% to 3.4% in the urban area and 0.8% to 4.7% in the downtown area. Compared with grid-centralized training, the proposed model reduces the prediction error in all three regions by 49.8% to 79.1%.
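A minimal Python sketch of the two core steps described above (selecting grids whose traffic distributions are close under JS divergence, then fusing their parameters with FedAvg) follows; the function names, similarity threshold, and use of traffic histograms as distribution estimates are illustrative assumptions rather than the paper's implementation.

    import numpy as np

    def js_divergence(p, q, eps=1e-12):
        # Jensen-Shannon divergence between two discrete distributions.
        p = np.asarray(p, dtype=float) + eps
        q = np.asarray(q, dtype=float) + eps
        p, q = p / p.sum(), q / q.sum()
        m = 0.5 * (p + q)
        kl = lambda a, b: np.sum(a * np.log(a / b))
        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    def select_similar_grids(target_hist, grid_hists, threshold=0.1):
        # Indices of grids whose traffic histograms are JS-close to the target.
        return [i for i, h in enumerate(grid_hists)
                if js_divergence(target_hist, h) <= threshold]

    def fedavg(param_sets, sample_counts):
        # FedAvg: sample-size-weighted average of per-grid parameter vectors.
        w = np.asarray(sample_counts, dtype=float)
        w /= w.sum()
        return sum(wi * np.asarray(p) for wi, p in zip(w, param_sets))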
Aiming at the problem of inaccurate prediction of edges and the farthest regions in monocular image depth estimation, a monocular depth estimation method based on Pyramid Split attention Network (PS-Net) was proposed. Firstly, building on Boundary-induced and Scene-aggregated Network (BS-Net), the Pyramid Split Attention (PSA) module was introduced in PS-Net to process the spatial information of multi-scale features and effectively establish long-range dependencies among multi-scale channel attentions, thereby extracting boundaries with sharply changing depth gradients as well as the farthest regions. Then, the Mish function was used as the activation function in the decoder to further improve the performance of the network. Finally, training and evaluation were performed on the NYUD v2 (New York University Depth dataset v2) and iBims-1 (independent Benchmark images and matched scans v1) datasets. Experimental results on the iBims-1 dataset show that the proposed network reduces the Directed Depth Error (DDE) by 1.42 percentage points compared with BS-Net, and raises the proportion of correctly predicted depth pixels to 81.69%. These results demonstrate that the proposed network achieves high accuracy in depth prediction.
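For reference, the Mish activation mentioned above has the closed form Mish(x) = x * tanh(softplus(x)); a minimal sketch follows, with PyTorch assumed purely for illustration.

    import torch
    import torch.nn.functional as F

    def mish(x: torch.Tensor) -> torch.Tensor:
        # Mish(x) = x * tanh(ln(1 + e^x)); smooth and non-monotonic,
        # unlike ReLU, which helps gradient flow in the decoder.
        return x * torch.tanh(F.softplus(x))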
Feature selection is a key step in data preprocessing for software defect prediction. Aiming at the problems of existing feature selection methods, such as insignificant dimension reduction and low classification accuracy of the selected optimal feature subset, a feature selection method for software defect prediction based on Self-adaptive Hybrid Particle Swarm Optimization (SHPSO) was proposed. Firstly, combined with population partition, a self-adaptive weight update strategy based on Q-learning was designed, in which Q-learning was introduced to adaptively adjust the inertia weight according to the states of the particles. Secondly, to balance the global search ability in the early stage of the algorithm against the convergence speed in the later stage, time-varying learning factors based on curve adaptivity were proposed. Finally, a hybrid position update strategy was adopted to help particles escape local optima as early as possible and to increase the diversity of particles. Experiments were carried out on 12 public software defect datasets. The results show that, compared with the method using all features, commonly used traditional feature selection methods, and mainstream feature selection methods based on intelligent optimization algorithms, the proposed method can effectively improve the classification accuracy of the software defect prediction model and reduce the dimension of the feature space. Compared with the Improved Salp Swarm Algorithm (ISSA), the proposed method increases classification accuracy by about 1.60% on average and reduces the feature subset size by about 63.79% on average. Experimental results show that the proposed method can select a feature subset with high classification accuracy and small size.
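A minimal sketch of the particle update with an adaptive inertia weight follows; the Q-learning policy that chooses the weight from the particle's state is abstracted away (the weight w is simply passed in), so this shows only the canonical velocity and position update, not the full SHPSO method.

    import numpy as np

    def pso_step(x, v, pbest, gbest, w, c1, c2, rng):
        # Canonical PSO update; w is the (adaptively chosen) inertia weight,
        # c1/c2 are the time-varying learning factors described above.
        # rng is a np.random.Generator, e.g. np.random.default_rng().
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        return x + v, v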
Aiming at the problems of the Multi-scale Generative Adversarial Networks Image Inpainting algorithm (MGANII), such as unstable training during image inpainting, poor structural consistency, and insufficient details and textures in the inpainted images, a multi-scale generative adversarial network image inpainting algorithm based on multi-feature fusion was proposed. Firstly, to address poor structural consistency and insufficient details and textures, a Multi-Feature Fusion Module (MFFM) was introduced into the traditional generator, and a perception-based feature reconstruction loss function was introduced to improve the feature extraction ability of the dilated convolutional network, thereby supplying more details and texture features for the inpainted image. Then, a perception-based feature matching loss function was introduced into the local discriminator to enhance its discrimination ability, thereby improving the structural consistency of the inpainted image. Finally, a risk penalty term was introduced into the adversarial loss function to satisfy the Lipschitz continuity condition, so that the network converges rapidly and stably during training. On the CelebA dataset, the proposed multi-feature fusion image inpainting algorithm converges faster than MGANII. Meanwhile, the Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM) of the images inpainted by the proposed algorithm are improved by 0.45% to 8.67% and 0.88% to 8.06% respectively compared with those of the baseline algorithms, and the Fréchet Inception Distance score (FID) is reduced by 36.01% to 46.97% compared with the baseline algorithms. Experimental results show that the inpainting performance of the proposed algorithm is better than that of the baseline algorithms.
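A risk penalty enforcing Lipschitz continuity can take the gradient-penalty form familiar from WGAN-GP; the following PyTorch sketch assumes that form and 4-D image batches, and may differ from the paper's exact formulation.

    import torch

    def gradient_penalty(discriminator, real, fake, lam=10.0):
        # Penalize deviation of the discriminator's gradient norm from 1
        # at points interpolated between real and inpainted images.
        alpha = torch.rand(real.size(0), 1, 1, 1, device=real.device)
        inter = (alpha * real + (1 - alpha) * fake).requires_grad_(True)
        score = discriminator(inter)
        grads, = torch.autograd.grad(score.sum(), inter, create_graph=True)
        return lam * ((grads.flatten(1).norm(2, dim=1) - 1) ** 2).mean()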
Privacy Preserving Utility Mining (PPUM) suffers from long sanitization time, high computational complexity, and serious side effects. To solve these problems, a fast sanitization algorithm based on BCU-Tree and Dictionary (BCUTD) for high-utility mining was proposed. In this algorithm, a new tree structure called BCU-Tree was presented to store sensitive item information, and a bitwise-operator coding model was used to reduce the tree construction time and the search space. A dictionary table was used to store all nodes of the tree structure, so that only the dictionary table needed to be accessed when a sensitive item was modified, after which the sanitization process was completed. In experiments on four different datasets, the BCUTD algorithm outperforms Hiding High Utility Item First (HHUIF), Maximum Sensitive Utility-MAximum item Utility (MSU-MAU), and Fast Perturbation Using Tree and Table structures (FPUTT) in terms of sanitization time and side effects. Experimental results show that the BCUTD algorithm can effectively speed up the sanitization process and reduce both the side effects and the computational complexity of the algorithm.
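The dictionary-table idea, in which every tree node is indexed by its item so that modifying a sensitive item becomes a hash lookup instead of a tree traversal, can be sketched as follows; the node fields and the utility-reduction rule are assumptions for illustration, not the published data structure.

    class Node:
        def __init__(self, item, utility):
            self.item, self.utility, self.children = item, utility, []

    class BCUTree:
        def __init__(self):
            self.root = Node(None, 0)
            self.table = {}  # item -> all nodes holding that item

        def insert(self, parent, item, utility):
            node = Node(item, utility)
            parent.children.append(node)
            self.table.setdefault(item, []).append(node)
            return node

        def sanitize(self, sensitive_item, delta):
            # Touch only the dictionary table, never walk the tree.
            for node in self.table.get(sensitive_item, []):
                node.utility = max(0, node.utility - delta)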
There are very few paired high- and low-resolution images in the real world. Traditional single image Super-Resolution (SR) methods typically train models on pairs of high-resolution and low-resolution images, but they obtain the training set by synthesis, considering only bilinear downsampling as the image degradation process. However, image degradation in the real world is complex and diverse, and traditional image super-resolution methods have poor reconstruction performance on real images with unknown degradation. Aiming at these problems, a single image super-resolution method for real complex scenes was proposed. Firstly, high- and low-resolution images of various scenes were captured by cameras with different focal lengths and registered as image pairs to form the CSR (Camera Super-Resolution) dataset. Secondly, to simulate real-world image degradation as closely as possible, the image degradation model was improved through parameter randomization of degradation factors and nonlinear combinations of degradations; the dataset of high- and low-resolution image pairs was then combined with the image degradation model to synthesize the training set. Finally, since degradation factors were considered in the dataset, a residual shrinkage network and U-Net were embedded into the benchmark model to reduce, as much as possible, the redundant information introduced by degradation factors in the feature space. Experimental results indicate that, compared with the BSRGAN (Blind Super-Resolution Generative Adversarial Network) method under complex degradation conditions, the proposed method improves PSNR by 0.7 dB and 0.14 dB and improves SSIM by 0.001 and 0.031 on the RealSR and CSR test sets respectively. The proposed method achieves better objective indicators and visual effects than existing methods on complex degradation datasets.
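A minimal sketch of a randomized, nonlinearly combined degradation pipeline of the kind described above follows (OpenCV and color uint8 input assumed); the stage order is shuffled and each stage's parameters are drawn at random, but the specific kernels, noise levels, and JPEG qualities are illustrative, not the paper's settings.

    import random
    import numpy as np
    import cv2

    def degrade(img_hr, scale=4, rng=random.Random(0)):
        def blur(im):
            return cv2.GaussianBlur(im, (0, 0), rng.uniform(0.2, 3.0))

        def noise(im):
            sigma = rng.uniform(1.0, 15.0)
            return np.clip(im + np.random.normal(0, sigma, im.shape), 0, 255)

        def jpeg(im):
            q = rng.randint(30, 95)
            ok, buf = cv2.imencode('.jpg', im.astype(np.uint8),
                                   [cv2.IMWRITE_JPEG_QUALITY, q])
            return cv2.imdecode(buf, cv2.IMREAD_COLOR).astype(np.float64)

        ops = [blur, noise, jpeg]
        rng.shuffle(ops)  # nonlinear combination: random stage order
        im = img_hr.astype(np.float64)
        for op in ops:
            im = op(im)
        h, w = im.shape[:2]
        lr = cv2.resize(im, (w // scale, h // scale),
                        interpolation=cv2.INTER_LINEAR)
        return np.clip(lr, 0, 255).astype(np.uint8)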
In order to reduce the regression test set and improve the efficiency of regression testing in a Continuous Integration (CI) environment, a regression test suite selection method for the CI environment was proposed. First, commits were prioritized based on the historical failure rate and execution rate of each test suite related to each commit. Then, a machine learning method was used to predict the failure rates of the test suites involved in each commit, and the test suites with higher failure rates were selected. In this method, commit prioritization and test suite selection were combined to increase the failure detection rate while reducing the test cost. Experimental results on Google's open-source dataset show that, compared with methods using the same commit prioritization or the same test suite selection technique, the proposed method achieves the largest improvement in the Average Percentage of Faults Detected per cost (APFDc), by 1% to 27%; at the same test-time cost, the TestRecall of the method increases by 33.33 to 38.16 percentage points, the ChangeRecall increases by 15.67 to 24.52 percentage points, and the test suite SelectionRate decreases by about 6 percentage points.
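The two stages can be sketched as follows; the scoring rule (product of historical failure and execution rates), the scikit-learn-style classifier interface, and the field names are illustrative assumptions.

    def prioritize_commits(commits):
        # Rank commits by the strongest historical failure signal among
        # their related test suites (higher score -> tested earlier).
        def score(commit):
            return max(s['hist_fail_rate'] * s['hist_exec_rate']
                       for s in commit['suites'])
        return sorted(commits, key=score, reverse=True)

    def select_suites(commit, model, threshold=0.5):
        # Keep only suites whose predicted failure probability is high.
        return [s for s in commit['suites']
                if model.predict_proba([s['features']])[0][1] >= threshold]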
Considering the lack of effective trend feature descriptors in existing methods, financial technical indicators such as the Vertical Horizontal Filter (VHF) and Moving Average Convergence/Divergence (MACD) were introduced into power data analysis, and an anomaly detection algorithm and a load forecasting algorithm using financial technical indicators were proposed. In the proposed anomaly detection algorithm, the thresholds of various financial technical indicators were determined based on statistics, and abnormal power consumption behaviors of users were then detected by threshold detection. In the proposed load forecasting algorithm, 14-dimensional daily load features related to financial technical indicators were extracted, and a Long Short-Term Memory (LSTM) load forecasting model was built. Experimental results on industrial power data of Hangzhou City show that the proposed load forecasting algorithm reduces the Mean Absolute Percentage Error (MAPE) to 9.272%, which is lower than that of the Autoregressive Integrated Moving Average (ARIMA), Prophet and Support Vector Machine (SVM) algorithms by 2.322, 24.175 and 1.310 percentage points respectively. The results show that financial technical indicators can be effectively applied to power data analysis.
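For reference, the two indicators named above can be computed over a daily load series as in the following pandas sketch; the window lengths are the conventional financial defaults, which is an assumption here.

    import pandas as pd

    def vhf(series: pd.Series, n: int = 28) -> pd.Series:
        # Vertical Horizontal Filter: ratio of net range to total movement;
        # values near 1 indicate a trending series, near 0 a choppy one.
        span = series.rolling(n).max() - series.rolling(n).min()
        churn = series.diff().abs().rolling(n).sum()
        return span / churn

    def macd(series: pd.Series, fast=12, slow=26, signal=9):
        # MACD line = fast EMA - slow EMA; signal line = EMA of MACD line.
        macd_line = (series.ewm(span=fast, adjust=False).mean()
                     - series.ewm(span=slow, adjust=False).mean())
        return macd_line, macd_line.ewm(span=signal, adjust=False).mean()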
The emergence of RAMCloud has improved the user experience of OnLine Data-Intensive (OLDI) applications; however, its energy consumption is higher than that of traditional cloud data centers. An energy-efficient strategy for disks under this architecture was put forward to solve this problem. Firstly, the fitness function and roulette wheel selection from genetic algorithms were introduced to choose energy-saving disks for persistent data backup; secondly, a reasonable buffer size was determined to extend the average continuous idle time of disks, so that some of them could be put into standby during their idle time. The simulation results show that the proposed strategy can effectively save energy, by about 12.69% in a given RAMCloud system with 50 servers. The buffer size affects both the energy-saving effect and data availability, so it must be chosen as a trade-off.
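Roulette wheel selection itself is standard; a minimal sketch for choosing backup disks in proportion to a fitness value follows, where the fitness definition (for example, favoring disks with lower projected energy cost) is an assumption.

    import random

    def roulette_select(disks, fitness, k, rng=random.Random()):
        # Sample k disks (with replacement) with probability
        # proportional to their fitness values.
        total = sum(fitness[d] for d in disks)
        chosen = []
        for _ in range(k):
            r, acc = rng.uniform(0, total), 0.0
            for d in disks:
                acc += fitness[d]
                if acc >= r:
                    chosen.append(d)
                    break
        return chosen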
Programmers' copy, paste, and modify activities result in a large number of code clones in software systems, and inconsistent changes to code clones are a major cause of program errors and increased maintenance costs during software version evolution. To solve this problem, a new research method was proposed. Firstly, the mapping relationships between clone groups were built. Then, the topics of lineal clone clusters were extracted using the Latent Dirichlet Allocation (LDA) model. Finally, the probability of inconsistent change of code clones was predicted. A software system with eight versions was tested, and obvious discrimination was obtained. The experimental results show that the method can effectively predict the probability of inconsistent change and can be used to evaluate the quality and credibility of software.
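Topic extraction for clone groups can be sketched with scikit-learn's LDA as below; treating clone code as plain text for tokenization is a simplifying assumption, and the topic count is illustrative.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    def clone_topics(clone_texts, n_topics=10):
        # Bag-of-words over clone-group source text, then LDA to obtain
        # a per-clone-group topic distribution used as a feature vector.
        X = CountVectorizer(token_pattern=r'\w+').fit_transform(clone_texts)
        lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
        return lda.fit_transform(X)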
In view of the problems that vision-based posture recognition imposes strict requirements on the environment and has low anti-interference capability, a posture recognition method based on predefined bones was proposed. The algorithm detected the human body by combining Kinect multi-scale depth and gradient information, recognized each body part with a random forest trained on positive and negative samples, and built the body posture vector. According to the posture category, the optimal separating hyperplane and kernel function were constructed by using an improved support vector machine to classify postures. The experimental results show that the recognition rate of this scheme is 94.3%, with good real-time performance, strong anti-interference capability and good robustness.
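The final classification stage can be sketched as follows; scikit-learn's SVC stands in for the improved support vector machine, and the RBF kernel and hyperparameters are assumptions.

    from sklearn.svm import SVC

    def train_posture_classifier(posture_vectors, labels):
        # Kernel SVM finds the optimal separating hyperplane in the
        # feature space induced by the chosen kernel.
        clf = SVC(kernel='rbf', C=10.0)
        clf.fit(posture_vectors, labels)
        return clf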
A fast image stitching algorithm based on improved Speeded Up Robust Feature (SURF) was proposed to overcome the real-time and robustness problems of the original SURF-based stitching algorithms. A machine learning method was adopted to build a binary classifier that identified the critical feature points obtained by SURF and removed the non-critical ones. In addition, the Relief-F algorithm was used to reduce the dimension of the improved SURF descriptor to accomplish image registration, and a weighted threshold fusion algorithm was adopted to achieve seamless image stitching. Several experiments verify the real-time performance and robustness of the improved algorithm, and show that the efficiency of image registration and the speed of image stitching are improved.
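SURF detection and ratio-test matching for registration can be sketched with OpenCV as below; SURF lives in the opencv-contrib xfeatures2d module (and is disabled in some builds for patent reasons), and the classifier and Relief-F steps described above are omitted from this sketch.

    import cv2

    def surf_match(img1, img2, hessian=400, ratio=0.75):
        # Detect SURF keypoints/descriptors, then keep matches that pass
        # Lowe's ratio test for use in estimating the stitching homography.
        surf = cv2.xfeatures2d.SURF_create(hessian)
        k1, d1 = surf.detectAndCompute(img1, None)
        k2, d2 = surf.detectAndCompute(img2, None)
        pairs = cv2.BFMatcher(cv2.NORM_L2).knnMatch(d1, d2, k=2)
        return [m for m, n in pairs if m.distance < ratio * n.distance]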